impact function
Balancing Interpretability and Flexibility in Modeling Diagnostic Trajectories with an Embedded Neural Hawkes Process Model
Zhao, Yuankang, Engelhard, Matthew
The Hawkes process (HP) is commonly used to model event sequences with self-reinforcing dynamics, including electronic health records (EHRs). Traditional HPs capture self-reinforcement via parametric impact functions that can be inspected to understand how each event modulates the intensity of others. Neural network-based HPs offer greater flexibility, resulting in improved fit and prediction performance, but at the cost of interpretability, which is often critical in healthcare. In this work, we aim to understand and improve upon this tradeoff. We propose a novel HP formulation in which impact functions are modeled by defining a flexible impact kernel, instantiated as a neural network, in event embedding space, which allows us to model large-scale event sequences with many event types. This approach is more flexible than traditional HPs yet more interpretable than other neural network approaches, and allows us to explicitly trade flexibility for interpretability by adding transformer encoder layers to further contextualize the event embeddings. Results show that our method accurately recovers impact functions in simulations, achieves competitive performance on the MIMIC-IV procedure dataset, and yields clinically meaningful interpretations on the XX-EHR children's diagnosis dataset, even without transformer layers. This suggests that our flexible impact kernel is often sufficient to capture self-reinforcing dynamics in EHRs and other data effectively, implying that interpretability can be maintained without loss of performance.
- North America > United States > North Carolina > Durham County > Durham (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
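The "parametric impact functions" of a traditional Hawkes process, as mentioned in the abstract above, can be illustrated with a small sketch. It assumes an exponential impact function, alpha[k, j] * exp(-beta * lag), which is a common textbook choice and not the neural impact kernel proposed in the paper. The names `hawkes_intensity`, `mu`, `alpha`, and `beta` are illustrative assumptions.

```python
import numpy as np

def hawkes_intensity(t, event_times, event_types, mu, alpha, beta):
    """Intensity of every event type at time t (assumed exponential impact).

    lambda_k(t) = mu[k] + sum over past events i of
                  alpha[k, k_i] * exp(-beta * (t - t_i))
    """
    event_times = np.asarray(event_times, dtype=float)
    event_types = np.asarray(event_types, dtype=int)
    past = event_times < t                 # only strictly earlier events contribute
    decay = np.exp(-beta * (t - event_times[past]))
    # alpha[:, k_i] holds the impact of source type k_i on each target type
    return mu + alpha[:, event_types[past]] @ decay

# Hypothetical two-type example: type 0 excites both types, type 1 only itself.
mu = np.array([0.1, 0.2])
alpha = np.array([[0.5, 0.0],
                  [0.3, 0.4]])
lam = hawkes_intensity(2.0, [0.5, 1.0], [0, 1], mu, alpha, beta=1.0)
```

Because each entry of `alpha` can be read directly as "how much an event of type j raises the rate of type k", this form is easy to inspect, which is the interpretability property the abstract contrasts with neural approaches.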
A Dirichlet Mixture Model of Hawkes Processes for Event Sequence Clustering
How to cluster event sequences generated via different point processes is an interesting and important problem in statistical machine learning. To solve this problem, we propose and discuss an effective model-based clustering method based on a novel Dirichlet mixture model of a special but significant type of point process, the Hawkes process. The proposed model generates the event sequences of different clusters from Hawkes processes with different parameters, and uses a Dirichlet distribution as the prior distribution of the clusters. We prove the identifiability of our mixture model and propose an effective variational Bayesian inference algorithm to learn our model. An adaptive inner iteration allocation strategy is designed to accelerate the convergence of our algorithm. Moreover, we investigate the sample complexity and the computational complexity of our learning algorithm in depth. Experiments on both synthetic and real-world data show that the clustering method based on our model can robustly learn structural triggering patterns hidden in asynchronous event sequences and achieve superior performance on clustering purity and consistency compared to existing methods.
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Time to Cite: Modeling Citation Networks using the Dynamic Impact Single-Event Embedding Model
Nakis, Nikolaos, Celikkanat, Abdulkadir, Boucherie, Louis, Lehmann, Sune, Mørup, Morten
Understanding the structure and dynamics of scientific research, i.e., the science of science (SciSci), has become an important area of research in order to address imminent questions including how scholars interact to advance science, how disciplines are related and evolve, and how research impact can be quantified and predicted. Central to the study of SciSci has been the analysis of citation networks. Here, two prominent modeling methodologies have been employed: one is to assess the citation impact dynamics of papers using parametric distributions, and the other is to embed the citation networks in a latent space optimal for characterizing the static relations between papers in terms of their citations. Interestingly, citation networks are a prominent example of single-event dynamic networks, i.e., networks for which each dyad only has a single event (i.e., the point in time of citation). We presently propose a novel likelihood function for the characterization of such single-event networks. Using this likelihood, we propose the Dynamic Impact Single-Event Embedding model (DISEE). The DISEE model characterizes the scientific interactions in terms of a latent distance model in which random effects account for citation heterogeneity, while the time-varying impact is characterized using existing parametric representations for assessment of dynamic impact. We highlight the proposed approach on several real citation networks, finding that DISEE reconciles static latent distance network embedding approaches well with classical dynamic impact assessments.
- North America > United States > New York > New York County > New York City (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Denmark (0.05)
- Europe > Spain (0.04)
Short-term Temporal Dependency Detection under Heterogeneous Event Dynamic with Hawkes Processes
Chen, Yu, Li, Fengpei, Schneider, Anderson, Nevmyvaka, Yuriy, Amarasingham, Asohan, Lam, Henry
Many event sequence datasets exhibit mutually exciting or inhibiting patterns. Reliable detection of such temporal dependency is crucial for scientific investigation. The de facto model is the Multivariate Hawkes Process (MHP), whose impact function naturally encodes a causal structure in the sense of Granger causality. However, the vast majority of existing methods use a direct or nonlinear transform of the standard MHP intensity with a constant baseline, which is inconsistent with real-world data. Under irregular and unknown heterogeneous intensity, capturing temporal dependency is hard, as one struggles to distinguish the effect of mutual interaction from that of intensity fluctuation. In this paper, we address the short-term temporal dependency detection issue. We show that the maximum likelihood estimation (MLE) for cross-impact from MHP has an error that cannot be eliminated but may be reduced by an order of magnitude, using the heterogeneous intensity not of the target HP but of the interacting HP. We then propose a robust and computationally efficient method, modified from MLE, that does not rely on prior estimation of the heterogeneous intensity and is thus applicable in a data-limited regime (e.g., few-shot, no repeated observations). Extensive experiments on various datasets show that our method outperforms existing ones by notable margins, with highlighted novel applications in neuroscience.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > California (0.04)
- Research Report > Experimental Study (0.68)
- Research Report > Promising Solution (0.67)
- Banking & Finance (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
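For orientation, the standard constant-baseline MLE objective that the abstract above takes as its starting point can be sketched for the simplest case. This is a minimal sketch assuming a univariate Hawkes process with an exponential impact function alpha * exp(-beta * lag) observed on [0, T]. It is not the modified estimator proposed in the paper, and the function name `hawkes_loglik` is an illustrative assumption.

```python
import numpy as np

def hawkes_loglik(times, T, mu, alpha, beta):
    """Log-likelihood of a univariate exponential-kernel Hawkes process on [0, T].

    log L = sum_i log(lambda(t_i)) - integral_0^T lambda(s) ds,
    with lambda(t) = mu + alpha * sum_{t_j < t} exp(-beta * (t - t_j)).
    """
    times = np.asarray(times, dtype=float)
    loglik = 0.0
    a = 0.0        # running sum of exp(-beta * (t_i - t_j)) over earlier events j
    prev = None
    for t in times:
        if prev is not None:
            # Recursive update avoids the O(n^2) double sum over event pairs.
            a = np.exp(-beta * (t - prev)) * (1.0 + a)
        loglik += np.log(mu + alpha * a)
        prev = t
    # Closed-form integral of the intensity (the compensator).
    compensator = mu * T + (alpha / beta) * np.sum(1.0 - np.exp(-beta * (T - times)))
    return loglik - compensator
```

Maximizing this function over `mu`, `alpha`, and `beta` gives the constant-baseline MLE. The abstract's point is that when the true baseline fluctuates, the estimate of the impact term absorbs part of that fluctuation.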
COHORTNEY: Deep Clustering for Heterogeneous Event Sequences
Zhuzhel, Vladislav, Rivera-Castro, Rodrigo, Kaploukhaya, Nina, Mironova, Liliya, Zaytsev, Alexey, Burnaev, Evgeny
Event sequences are attracting growing attention. In particular, clustering of event sequences is widely applicable in domains such as healthcare, marketing, and finance. Use cases include analysis of visitors to websites, hospitals, or bank transactions. Unlike traditional time series, event sequences tend to be sparse and not equally spaced in time. As a result, they exhibit different properties, which are essential to account for when developing state-of-the-art methods. The community has paid little attention to the specifics of heterogeneous event sequences. Existing research in clustering primarily focuses on classic time series data. It is unclear whether the methods proposed in the literature generalize well to event sequences. Here we propose COHORTNEY as a novel deep learning method for clustering heterogeneous event sequences. Our contributions include (i) a novel method using a combination of LSTM and the EM algorithm, together with a code implementation; (ii) a comparison of this method to previous research on time series and event sequence clustering; (iii) a performance benchmark of different approaches on a new dataset from the finance industry and fourteen additional datasets. Our results show that COHORTNEY vastly outperforms the state-of-the-art algorithm for clustering event sequences in both speed and cluster quality.
- Research Report > New Finding (0.86)
- Research Report > Promising Solution (0.54)
- Banking & Finance (0.68)
- Health & Medicine (0.54)
Fair Bandit Learning with Delayed Impact of Actions
Tang, Wei, Ho, Chien-Ju, Liu, Yang
Algorithmic fairness has been studied mostly in a static setting where the implicit assumption is that the frequencies of historically made decisions do not impact the problem structure in the future. However, for example, the capability of people in a certain group to pay back a loan might depend on how frequently that group has historically had loan applications approved. If banks keep rejecting loan applications from people in a disadvantaged group, it could create a feedback loop and further damage the chance of getting loans for people in that group. This challenge has been noted in several recent works but is under-explored in a more generic sequential learning setting. In this paper, we formulate this delayed and long-term impact of actions within the context of multi-armed bandits (MAB). We generalize the classical bandit setting to encode the dependency of this action "bias" due to the history of the learning. Our goal is to learn to maximize the collected utilities over time while satisfying fairness constraints imposed over arms' utilities, which again depend on the decisions they have received. We propose an algorithm that achieves a regret of $\tilde{\mathcal{O}}(KT^{2/3})$ and show a matching regret lower bound of $\Omega(KT^{2/3})$, where $K$ is the number of arms and $T$ denotes the learning horizon. Our results complement the bandit literature by adding techniques to deal with actions with long-term impacts and have implications for designing fair algorithms.
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)
- Education (0.92)
- Banking & Finance > Loans (0.54)
Predicting Smoking Events with a Time-Varying Semi-Parametric Hawkes Process Model
Engelhard, Matthew, Xu, Hongteng, Carin, Lawrence, Oliver, Jason A, Hallyburton, Matthew, McClernon, F Joseph
Health risks from cigarette smoking -- the leading cause of preventable death in the United States -- can be substantially reduced by quitting. Although most smokers are motivated to quit, the majority of quit attempts fail. A number of studies have explored the role of self-reported symptoms, physiologic measurements, and environmental context on smoking risk, but less work has focused on the temporal dynamics of smoking events, including daily patterns and related nicotine effects. In this work, we examine these dynamics and improve risk prediction by modeling smoking as a self-triggering process, in which previous smoking events modify current risk. Specifically, we fit smoking events self-reported by 42 smokers to a time-varying semi-parametric Hawkes process (TV-SPHP) developed for this purpose. Results show that the TV-SPHP achieves superior prediction performance compared to related and existing models, with the incorporation of time-varying predictors having the greatest benefit over longer prediction windows. Moreover, the impact function illustrates previously unknown temporal dynamics of smoking, with possible connections to nicotine metabolism to be explored in future work through a randomized study design. By more effectively predicting smoking events and exploring a self-triggering component of smoking risk, this work supports development of novel or improved cessation interventions that aim to reduce death from smoking.
- North America > United States > North Carolina > Durham County > Durham (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > India (0.04)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
- Education (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Data Science (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
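The idea of a "self-triggering process, in which previous smoking events modify current risk" can be made concrete with a small simulation. The sketch below assumes a plain univariate Hawkes process with a constant baseline and an exponential impact function, simulated with Ogata's thinning algorithm. It is not the TV-SPHP model from the abstract, which is time-varying and semi-parametric, and all parameter values are hypothetical.

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, T, seed=0):
    """Simulate event times on [0, T) by thinning (assumed exponential impact).

    Intensity: lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
    Choose alpha / beta < 1 so the process does not explode.
    """
    rng = np.random.default_rng(seed)
    events = []
    t = 0.0
    excitation = 0.0                      # self-triggered part of lambda at time t
    while True:
        # Between events the intensity only decays, so its current value
        # is a valid upper bound until the next accepted event.
        lam_bar = mu + excitation
        w = rng.exponential(1.0 / lam_bar)
        t += w
        if t >= T:
            break
        excitation *= np.exp(-beta * w)   # decay excitation to the candidate time
        if rng.uniform() * lam_bar <= mu + excitation:
            events.append(t)              # accept the candidate as an event
            excitation += alpha           # each event raises subsequent risk
    return np.array(events)

# Hypothetical parameters: each event adds 0.8 to the rate, decaying at rate 1.2.
ev = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, T=50.0, seed=1)
```

Setting `alpha=0` removes the self-triggering term and reduces the simulation to a homogeneous Poisson process, which is a convenient baseline for comparing how clustered the simulated events are.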